Any plans for a Qwen3-32B model?
#9
by
wanghf
- opened
It's the most cost-effective model to distill into right now.
+1
Probably not, since Qwen3-32B has no released base model.
Damn.
You could still fine-tune the instruct model; it just may forget some of the original Qwen3 knowledge (overridden by the R1-0528 dataset).
https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct is a good candidate, and the Qwen3 30B-A3B would be awesome too.
The Llama 8B distills are so useful; I hope this one gets the same treatment.